Topic-Aware Multi-turn Dialogue Modeling
Authors
Abstract
In retrieval-based multi-turn dialogue modeling, it remains a challenge to select the most appropriate response by extracting salient features from the context utterances. As a conversation goes on, topic shift at the discourse level naturally happens through the continuous context. However, all known systems are satisfied with exploiting local words for utterance representation but fail to capture such essential global topic-aware clues at the discourse level. Instead of taking the topic-agnostic n-gram as the processing unit for matching purposes, as existing systems do, this paper presents a novel solution which segments and extracts utterances in an unsupervised way, so that the resulting model is capable of capturing topic shift as needed and thus effectively tracking topic flow during a conversation. Our modeling is implemented by a newly proposed segmentation algorithm and the Topic-Aware Dual-attention Matching (TADAM) Network, which matches each segment with the response in a dual cross-attention way. Experimental results on three public datasets show that TADAM can outperform the state-of-the-art method, especially by 3.3% on the E-commerce dataset, which has an obvious topic shift.
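To make the phrase "dual cross-attention" more concrete, the following is a minimal sketch of matching one context segment against a candidate response in both directions. It is an illustration under stated assumptions, not the TADAM architecture: the function names (`cross_attend`, `mean_pool`, `dual_cross_attention_score`), the scaled dot-product attention, the mean pooling, and the final dot-product score are all hypothetical choices made for this example, and the inputs are assumed to be lists of pre-computed token vectors.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cross_attend(queries, keys):
    """For each query vector, return the attention-weighted average of the keys.

    Assumption: plain scaled dot-product attention with no learned projections.
    """
    dim = len(keys[0])
    scale = math.sqrt(dim)
    attended = []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in keys])
        attended.append(
            [sum(w * k[d] for w, k in zip(weights, keys)) for d in range(dim)]
        )
    return attended


def mean_pool(vectors):
    """Average a list of vectors into a single vector."""
    count = len(vectors)
    return [sum(v[d] for v in vectors) / count for d in range(len(vectors[0]))]


def dual_cross_attention_score(segment, response):
    """Score a (segment, response) pair using attention in both directions.

    Hypothetical scoring rule: pool each attended view and take their dot product.
    """
    segment_view = mean_pool(cross_attend(segment, response))   # segment attends to response
    response_view = mean_pool(cross_attend(response, segment))  # response attends to segment
    return dot(segment_view, response_view)


if __name__ == "__main__":
    # Toy 2-dimensional "token embeddings"; the values are illustrative only.
    segment = [[1.0, 0.0], [0.5, 0.5]]
    response = [[0.9, 0.1], [0.0, 1.0]]
    print(dual_cross_attention_score(segment, response))
```

In a real system the token vectors would come from an encoder and the score would feed a learned classifier; this sketch only shows the two attention directions that "dual" refers to.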
Similar resources
Multi-objective Topic Modeling
Topic Modeling (TM) is a rapidly growing area at the interface of text mining, artificial intelligence and statistical modeling that is being increasingly deployed to address the "information overload" associated with extensive text repositories. The goal in TM is typically to infer a rich yet intuitive summary model of a large document collection, indicating a specific collection of topics t...
Ordering-Sensitive and Semantic-Aware Topic Modeling
Topic modeling of textual corpora is an important and challenging problem. Most previous work makes the "bag-of-words" assumption, which ignores the ordering of words. This assumption simplifies the computation, but it unrealistically discards the ordering information and the semantics of words in context. In this paper, we present a Gaussian Mixture Neural Topic Model (GMNTM) whic...
Multi-field Correlated Topic Modeling
Popular methods for probabilistic topic modeling like the Latent Dirichlet Allocation (LDA, [1]) and Correlated Topic Models (CTM, [2]) share an important property, i.e., using a common set of topics to model all the data. This property can be too restrictive for modeling complex data entries where multiple fields of heterogeneous data jointly provide rich information about each object or event...
DailyDialog: A Manually Labelled Multi-turn Dialogue Dataset
We develop a high-quality multi-turn dialog dataset, DailyDialog, which is intriguing in several aspects. The language is human-written and less noisy. The dialogues in the dataset reflect the way we communicate daily and cover various topics about our daily life. We also manually label the developed dataset with communication intention and emotion information. Then, we evaluate existing approac...
The Ubuntu Dialogue Corpus: A Large Dataset for Research in Unstructured Multi-Turn Dialogue Systems
This paper introduces the Ubuntu Dialogue Corpus, a dataset containing almost 1 million multi-turn dialogues, with a total of over 7 million utterances and 100 million words. This provides a unique resource for research into building dialogue managers based on neural language models that can make use of large amounts of unlabeled data. The dataset has both the multi-turn property of conversatio...
Journal
Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence
Year: 2021
ISSN: 2159-5399, 2374-3468
DOI: https://doi.org/10.1609/aaai.v35i16.17668